Fix the benchmarks #1001

adamsitnik · 2018-09-24T19:08:24Z

@davidwrighton and I both ran into an issue with NuGet package restore today. ML.NET defines its package sources in the Directory.Build.props file, but as suggested by @agocke in Exclude Directory.Build.props/targets from generated csproj files BenchmarkDotNet#854, BDN ignores those files and needs a classic nuget.config file to work. (I know it's not perfect.)
In different config files for train and predict benchmarks #954 I missed the fact that there is no global config. This change sets RecommendedConfig as the default; when [TrainConfig] is used, it overrides RecommendedConfig. A few types were missing a config attribute ([Config(typeof(SomeConfig))]), so they were using the default config from BenchmarkDotNet.
I mentioned how to download external dependencies in the README file.

Anipik · 2018-09-24T19:14:59Z

test/Microsoft.ML.Benchmarks/Harness/Configs.cs

+        protected virtual Job GetJobDefinition()
+            => Job.Default
+                .WithWarmupCount(1) // ML.NET benchmarks are typically CPU-heavy benchmarks, 1 warmup is usually enough
+                .WithMaxIterationCount(20);


@adamsitnik I think we should make the train config the default configuration, as most of the tests run with the train configuration?

@Anipik thank you for the suggestion! The TrainConfig is a very specific config, so I would prefer not to make it the default one.

danmoseley · 2018-09-24T23:54:12Z

nuget.config

+<configuration>
+ <packageSources>
+    <!--To inherit the global NuGet package sources remove the <clear/> line below -->
+    <clear />


@eerhardt will this affect anything else, or will they all keep using the MSBuild source?

I'd like to not have this list duplicated. NuGet.configs are not great because they cannot be modified (either adding new sources or removing sources, or changing locations) without actually modifying the file.

Instead, would it be possible to:

Pull the RestoreSources property out of Directory.Build.props and put it in a separate file, say build\NuGetSources.props. Directory.Build.props then imports that file.

Change the Benchmark toolchain to import the build\NuGetSources.props file.

That way we don't need to duplicate this list.
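
The split suggested above could look roughly like the sketch below. The file name build\NuGetSources.props comes from the comment itself. The feed URL is only a placeholder (the public nuget.org feed), since the actual list of ML.NET sources is not shown in this conversation.

```xml
<!-- build/NuGetSources.props: single place that defines the restore feeds -->
<Project>
  <PropertyGroup>
    <!-- Placeholder feed; the real list would be moved here from Directory.Build.props -->
    <RestoreSources>
      https://api.nuget.org/v3/index.json;
      $(RestoreSources)
    </RestoreSources>
  </PropertyGroup>
</Project>
```

```xml
<!-- Directory.Build.props: imports the shared list instead of defining it inline -->
<Project>
  <Import Project="$(MSBuildThisFileDirectory)build\NuGetSources.props" />
</Project>
```

A project generated by the benchmark toolchain could then import the same build\NuGetSources.props file, so the list of feeds lives in one place.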


@eerhardt I have changed the way it works: our project generator now generates a simple .csproj file which includes the native dependencies and does not ignore the Directory.Build.props file, so BDN uses the NuGet feeds defined in Directory.Build.props.
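
As a rough illustration only, a generated project of this kind might resemble the sketch below. The ProjectReference path and the target framework are hypothetical placeholders. NativeAssemblyReference is the ML.NET repository's own item type, as it appears in this PR's diffs. Because MSBuild looks for a Directory.Build.props file in the parent directories of a project, a file generated inside the repository would pick up the feeds defined there, assuming the generator does not disable that import.

```xml
<!-- Hypothetical shape of a generated benchmark project; not the actual template -->
<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <!-- Placeholder target framework -->
    <TargetFramework>netcoreapp2.1</TargetFramework>
  </PropertyGroup>
  <ItemGroup>
    <!-- Placeholder path to the project that defines the benchmarks -->
    <ProjectReference Include="..\..\Microsoft.ML.Benchmarks.csproj" />
    <!-- Repo-specific item that copies the native dependency next to the output -->
    <NativeAssemblyReference Include="CpuMathNative" />
  </ItemGroup>
</Project>
```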

adamsitnik · 2018-09-25T13:51:39Z

I disabled the MemoryDiagnoser by default; it can now be enabled on demand via console args: -m or --memory. I did this because MemoryDiagnoser requires one extra iteration, so if a benchmark takes 30s to run, we needed an additional 30s to get the memory diagnostics. The long-running benchmarks will now take less time (cc @justinormont).

adamsitnik · 2018-09-26T23:47:58Z

@Anipik I have registered all the required assemblies so the benchmarks work again

@danmosemsft I added one integration test that runs one simple benchmark that depends on a native DLL and checks whether the benchmark executed correctly. I don't want the benchmarks to be broken again ;)

eerhardt

Looks good @adamsitnik. Thanks for enhancing these benchmarks. I just had a few random questions/thoughts.

eerhardt · 2018-09-28T00:23:58Z

test/Microsoft.ML.Benchmarks/Helpers.cs

@@ -19,4 +23,39 @@ internal class EmptyWriter : TextWriter
        internal static readonly EmptyWriter Instance = new EmptyWriter();
        public override Encoding Encoding => null;
    }
+
+    internal static class EnvironmentFactory


(nit) can this go in its own file?

eerhardt · 2018-09-28T01:49:36Z

test/Microsoft.ML.Benchmarks/Helpers.cs

+        {
+            var environment = new ConsoleEnvironment(verbose: false, sensitivity: MessageSensitivity.None, outWriter: EmptyWriter.Instance);
+
+            environment.ComponentCatalog.RegisterAssembly(typeof(TLoader).Assembly);


We could just have a single extension method on IHostEnvironment that registers all types/assemblies. Then callers wouldn't need to pass in the TTypes.

For example, like this

machinelearning/test/Microsoft.ML.TestFramework/EnvironmentExtensions.cs

Lines 17 to 29 in eb87467

    public static TEnvironment AddStandardComponents<TEnvironment>(this TEnvironment env)
        where TEnvironment : IHostEnvironment
    {
        env.ComponentCatalog.RegisterAssembly(typeof(TextLoader).Assembly); // ML.Data
        env.ComponentCatalog.RegisterAssembly(typeof(LinearPredictor).Assembly); // ML.StandardLearners
        env.ComponentCatalog.RegisterAssembly(typeof(CategoricalTransform).Assembly); // ML.Transforms
        env.ComponentCatalog.RegisterAssembly(typeof(FastTreeBinaryPredictor).Assembly); // ML.FastTree
        env.ComponentCatalog.RegisterAssembly(typeof(EnsemblePredictor).Assembly); // ML.Ensemble
        env.ComponentCatalog.RegisterAssembly(typeof(KMeansPredictor).Assembly); // ML.KMeansClustering
        env.ComponentCatalog.RegisterAssembly(typeof(PcaPredictor).Assembly); // ML.PCA
        env.ComponentCatalog.RegisterAssembly(typeof(Experiment).Assembly); // ML.Legacy
        return env;
    }

and you wouldn't need two Create methods. Just one that creates the environment and registers all the assemblies.

eerhardt · 2018-09-28T01:59:16Z

test/Microsoft.ML.Benchmarks.Tests/BenchmarksTest.cs

+
+        private ITestOutputHelper Output { get; }
+
+        [Fact(Skip = SkipTheDebug)]


I'm not sure I fully understand the goal of this test. Is it just testing that copying native dependencies in a BDN project works correctly? It won't catch breaks in our actual benchmark tests, right?

I'm sort of wondering if this test is valuable going forward or not.... What kind of changes is it guarding against?

The goal of this test is to make sure that we can build and run the benchmarks. So far they have been broken a few times. One example: the nuget.config file was removed, BDN was ignoring the Directory.Build.props file which contained the list of NuGet feeds, and it failed to restore one of the dependencies. A few people pinged me with the exact same problem.

With this test I am confident that changes which break building or running the benchmarks will be caught.

I would love to run the benchmarks as part of the test; however, they need a lot of time to execute. Some take as long as 300s for a single benchmark invocation.

eerhardt · 2018-09-28T02:00:24Z

test/Microsoft.ML.Benchmarks.Tests/Microsoft.ML.Benchmarks.Tests.csproj

+  <ItemGroup>
+    <ProjectReference Include="..\Microsoft.ML.Benchmarks\Microsoft.ML.Benchmarks.csproj" />
+
+    <NativeAssemblyReference Include="FastTreeNative" />


Are we using FastTree in the test?

eerhardt · 2018-09-28T02:01:04Z

test/Microsoft.ML.Benchmarks/Harness/Configs.cs

+        {
+            Add(DefaultConfig.Instance); // this config contains all of the basic settings (exporters, columns etc)
+
+            Add(GetJobDefinition() // jod defines how many times given benchmark should be executed


"jod" is a typo

eerhardt · 2018-09-28T02:04:58Z

test/Microsoft.ML.Benchmarks/Harness/ProjectGenerator.cs

+  </ItemGroup>
+  <ItemGroup>
+    <ProjectReference Include=""{GetProjectFilePath(buildPartition.RepresentativeBenchmarkCase.Descriptor.Type, logger).FullName}"" />
+    <NativeAssemblyReference Include=""CpuMathNative"" />


Is it possible to include all of the <NativeAssemblyReference> items from the calling BDN project instead of hard-coding them here? That way they only need to be defined in ML.Benchmarks.csproj.

eerhardt · 2018-09-28T14:58:47Z

test/Microsoft.ML.Benchmarks/Helpers/EmptyWriter.cs

-    {
-        public static string DatasetNotFound = "Could not find {0} Please ensure you have run 'build.cmd -- /t:DownloadExternalTestFiles /p:IncludeBenchmarkData=true' from the root";
-    }
-
    // Adding this class to not print anything to the console.


@Anipik or @adamsitnik - FYI for whoever works on the benchmark tests next: a new class that just came in is LocalEnvironment, which doesn't print to the console by default. We could remove this class and switch our environment from ConsoleEnvironment to LocalEnvironment. That would work around the issue as well, and is simpler.

adamsitnik added 4 commits September 24, 2018 19:11

move Harness-related code to Harness folder

9c5b40a

make sure that we always use recommended config

05b7d90

mention the external dependencies in the README

bf54bb9

add nuget.config file so BDN can restore all packages

0ce2640

adamsitnik requested review from eerhardt and Anipik September 24, 2018 19:08

Anipik reviewed Sep 24, 2018

danmoseley reviewed Sep 24, 2018

danmoseley approved these changes Sep 24, 2018

adamsitnik added 2 commits September 25, 2018 15:28

add comments to the config so I am not the only person who understands it

c1fb2c0

don't enable MemoryDiagnoser by default, it requires one extra iteration which is expensive for long running benchmarks

41fb025

adamsitnik added 6 commits September 26, 2018 20:36

Merge remote-tracking branch 'upstream/master' into fixBenchmarks

aa45f38

don't add nuget.config file, generate it on the fly when needed by BDN

bf3a85a

generate a .csproj file that will handle both native dependencies and nuget.config file issue

b555773

describe authoring new benchmarks in the docs

c1eb626

add some integration tests that make sure that the benchmarks are not broken

10bd93b

register the right assemblies after recent change of assembly loading, makes all benchmark work again ;)

c340cdf

Anipik approved these changes Sep 26, 2018

make Ranking benchmarks work

4112eab

adamsitnik changed the title ~~Fix some of the benchmarks~~ Fix the benchmarks Sep 27, 2018

eerhardt approved these changes Sep 28, 2018

code review: split Helpers.cs into multiple files, cleanup the code, don't hardcode the dependencies

9fbe544

adamsitnik merged commit 17ee205 into dotnet:master Sep 28, 2018

eerhardt reviewed Sep 28, 2018

ghost locked as resolved and limited conversation to collaborators Mar 28, 2022
